ABSTRACT


DISE  (Discrimination  and  Identification  of  Seismic  Events)  is  an 
interactive  computer  program  with  graphics  support  and  currently  runs  on 
a  VAX-11/780  computer  at  the  SDAC.  Using  various  commands  which  are 
available,  the  seismic  analyst  may  employ  location  data  or  waveform 
measurements  to  identify  unknown  events.  Groups  of  epicenters  may  be 
formed,  and  a  lower  level  of  subgroups  is  formed  when  particular 
stations  or  variables  are  selected  for  discrimination  purposes.  The 
program  supports  two  basic  approaches  to  event  identification  using 
waveform-derived data: multivariate discriminant functions or
multivariate clustering.

INTRODUCTION


The  identification  of  the  nature  of  seismic  events  in  a 
seismological context is based on the examination of location
coordinates  and  depth  and  on  inferences  from  the  recorded  waveforms. 
These  two  types  of  data,  i.e.  "location"  and  "waveform",  are  distinct 
and  are  employed  in  different  manners  in  the  decision  process.  Although 
they  complement  and  reinforce  one  another,  each  may,  by  itself,  be 
sufficient  to  make  a  fairly  reliable  decision  on  the  nature  of  an  event. 
Ideally,  they  will  be  used  together;  and  any  complete  analysis  procedure 
will  include  both. 

A  rational  strategy  for  identification  of  unknown  seismic  events  is 
outlined  in  Figure  1.  This  strategy  presumes  that  location  information 
and  certain  waveform  measurements  are  initially  available  to  the 
analyst.  The  outcome  of  the  analysis  is  a  decision  on  the  event  type, 
if  possible,  and  a  confidence  measure  placed  on  that  decision.  The 
reader  should  not  concentrate  on  particular  procedural  statements  in 
Figure  1  because  these  will  be  made  clear  in  the  following  section; 
rather  the  overall  scheme  should  be  examined.  This  scheme  involves  many 
branches  and  loops  characteristic  of  complex  decision  processes.  In 
fact  many  more  possible  paths  than  exhibited  here  would  exist  in  an 
operational  sense  provided  a  sufficiently  flexible  program  existed  for 
implementing  the  concepts  of  Figure  1. 

This  report  describes  an  interactive  program  which  implements  the 
scheme  of  Figure  1  and  which  has  been  coded  and  installed  on  a  VAX 
11/780  computer  at  the  Seismic  Data  Analysis  Center.  This  program  has 
the  title  "Discrimination  and  Identification  of  Seismic  Events",  or 
briefly "DISE", an acronym which emphasizes the role of probability and
statistics  in  the  decision  process.  This  program  fulfills  two  important 
current  requirements: 

1) provision of an interactive and graphical program
for studying and optimizing the event classification
properties or features of seismic data and

2) provision of a prototype program to routinely identify
events contained in a seismic bulletin in an efficient
manner using all appropriate seismic data input.

[Figure 1. Logic flow in an interactive discrimination mode (Modules 1 through 4; panel labels include Data Selection and Data Analysis).]

The  background  for  Figure  1  consists  of  twenty  years  of  research 
into  seismic  event  identification  under  the  VELA-Uniform  program,  as 
summarized in Rivers et al. (1981). That research consists largely
of disjointed studies on limited data bases, each using one particular
discrimination technique; more recently it has involved multivariate
discrimination, culminating in a ponderous experiment on 133 Asian
events by several contractors. A thorough evaluation of this recent
experiment  is  also  contained  in  Rivers  et  al.  (1981).  The  scheme  of 
Figure  1  embodies  the  statistical  techniques  used  in  the  Asian  event 
discrimination  experiment  and  in  fact  would  allow  for  a  duplication  of 
many  of  the  results  of  that  experiment,  given  the  same  data  base. 

Note  that  the  procedure  outlined  in  Figure  1  does  not  include  the 
actual  location  of  a  seismic  event  or  waveform  measurements.  These 
functions  occur  outside  the  formal  framework  of  the  present  DISE 
program;  however,  future  versions  of  the  program  will  be  linked  into  a 
larger  seismic  analysis  program  currently  being  coded  for  the  VAX  under 
separate  contract,  and  the  capability  of  refining  locations  or 
remeasuring  waveform  variables  within  the  discrimination  procedure  would 
then  exist.  This  full  capability  is  deemed  essential  to  a  truly 
efficient  and  reliable  discrimination  effort. 

The  remainder  of  this  report  consists  mainly  of  a  description  of  the 
DISE  program  and  its  environment.  The  description  of  the  program  is 
presented  in  a  format  which  lists  and  explains  the  purpose  of  each 
command  available  to  an  analyst  using  the  program.  It  is  intended  to  be 
a  functional  description  of  the  program  as  it  existed  at  the  time  this 
report  was  prepared.  Detailed  documentation  of  the  actual  coded  modules 
which comprise the DISE program is available from the SDAC on request.

DESCRIPTION OF THE DISE PROGRAM


Functional  Summary 

The  DISE  computer  program  is  a  seismic  analysis  tool  to: 

1) study and optimize the classification properties
or features of seismic data from natural earthquakes
and artificial seismic events, i.e., explosions, and

2) identify events contained in a seismic bulletin
as earthquakes or explosions, or otherwise, in a
rapid and routine manner.

The  procedure  for  accomplishing  these  tasks  is  embedded  in  an 
interactive  analysis  mode  with  graphics  aids  and  statistical  support 
routines.  The  basic  input  to  the  program  is  a  seismic  bulletin  listing 
event  hypocenters  and  their  associated  seismic  arrivals  at  various 
recording  stations.  Measurements  on  these  arrivals  are  assumed  to  be 
present  in  this  bulletin;  and  the  program,  as  presently  configured,  does 
not  allow  for  making  further  measurements  or  changing  those  already 
available. 

The  DISE  program  permits  the  definition  of  event  groups  within  the 
original  set  of  events.  The  defining  criteria  are: 

a)  location, 

b)  depth, 

c)  magnitude, 

d)  event  type, 

e)  date. 

Editing  procedures  are  available  for  refining  the  event  list 
corresponding  to  a  given  group. 

Furthermore,  subgroups  can  be  defined  from  their  respective  groups 
by  the  following  criteria: 

a)  station, 

b)  variable. 


The  DISE  program  permits  application  of  the  following  discrimination 
techniques: 

a)  event  location, 

b)  event  depth, 

c)  multivariate  discrimination  using  waveform-derived  data, 

d)  clustering  using  waveform-derived  data. 

The  DISE  program  provides  the  following  interactive  graphics 
displays  to  aid  the  discrimination  process: 

1)  epicenter  and  station  locations  on  coastline  and  political 
boundary  map  background, 

2)  hypocenters  projected  on  vertical  cross-sections, 

3)  scattergrams  and  histograms  for  discrimination  variables. 

The  interfacing  of  the  functioning  software  and  hardware  components 
of  DISE  is  shown  in  Figure  2.  In  addition  to  the  VAX  itself,  the  other 
hardware  components  which  are  employed  are  the  Graphicus-80  CRT  provided 
by  Vector  Automation,  Inc.,  an  Intercolor  A/N  terminal,  a  Talos  data 
tablet,  a  DEC  printer,  and  a  Versatec  hardcopy  unit.  The  analyst 
controls  the  flow  of  data  and  processing  through  the  A/N  terminal  by  a 
set  of  commands  with  qualifiers.  He  also  controls  the  data  displays  on 
the  Graphicus-80  through  its  keyboard  and  the  data  tablet. 


Data  Organization 


Knowledge  of  how  the  DISE  program  organizes  and  stores  the  data  is 
essential  to  a  proper  understanding  of  the  commands  available  to  the 
analyst.  The  data  base  at  the  beginning  consists  of  measurements  made 
at  certain  stations  for  certain  events.  Thus,  in  statistical  terms, 
there  are  multiple  variables  (seismic  measurements)  for  multiple  cases 
(events).  The  analyst  first  designates  groups  of  cases  (events)  on  the 
basis  of  their  location,  magnitude,  or  event  type  (e.g.  explosion  and 
earthquake).  Events  of  unknown  type  will  be  formed  into  another  group. 
Then  for  each  group,  the  analyst  may  want  to  define  various  subsets  of 
variables.  These  variables  may  be  chosen  from  only  one  station  or  from 
several stations, resulting in an averaging procedure over the stations
to produce network variables. Each definition of a subset of variables
involves  both  variable  and  station  selection  and  creates  a  subgrouping 
of  data  within  the  original  data  available  for  the  group.  The  subgroup 
contains  all  the  cases  (events)  of  the  original  group  but  only  a  clearly 
defined  subset  of  the  variables.  Multiple  subgroups  can  be  created  for 
each  group  depending  on  the  analysis  strategy. 

While  location  analysis  will  be  handled  at  the  group  level, 
statistical  discrimination  procedures  will  be  performed  at  the  subgroup 
level.

Interactive  Discrimination  Commands 

The  program  flow  is  controlled  by  analyst  commands  entered  through 
the A/N keyboard terminal. Presently the command syntax is inflexible,
and  command  strings  consist  of  variables  separated  by  commas.  A  list  of 
the  available  commands  and  a  brief  description  of  their  intended  usage 
is  given  in  Table  I.  A  detailed  description  of  each  command  in  Table  I 
is contained in the following subsections.

TABLE I

Interactive Discrimination Commands

Command
Name      Purpose

INIT      Initialize the program and set up data base.
GETE      Select an event to discriminate from master event set.
SELE      Get all epicenters from master event set within
          prescribed selection criteria.
MAPE      Display epicenters on map background.
CROS      Perform a transformation to put epicenters on a
          vertical cross section.
FRMG      Create a working set (group) of events taken together
          as a group.
TRSG      Transfer a working set (group) into temporary set.
DELG      Delete a working set (group).
SUMG      Summarize in tabular format the data availability for
          each group.
SDAT      Select data according to event groups, stations, and
          variables for use in statistical analysis.
SCAL      Apply scaling relations to remove magnitude-related
          bias of data among groups.
SIML      Compute similarity measure of a given event to
          previously formed groups, or similarity of groups.
CLUS      Perform cluster analysis on selected groups of events.
SPLT      Reform the event groups on the basis of clustering
          results.
XYPL      Form and display plots of any two selected variables.
HGPL      Display histogram of a given variable from a selected
          subgroup.
CDIS      Compute the multivariate discriminant function
          between two subgroups of data.
ADIS      Apply discriminant function to unknown event(s).
SDIS      Sum discriminant functions over stations.
LSTG      Display the attributes of all current groups.
LSTS      Display the attributes of all current subgroups.
LSTD      Display information pertaining to current
          discrimination functions.



INIT: Program Initialization


Presently,  this  command  is  automatically  invoked  at  the  start  of  the 
program  execution  because  the  data  sets  are  prepared  external  to  the 
program.  One  of  these  data  sets  is  the  master  variable  file  which  is 
organized  as  shown  in  Table  II  and  retained  on  disk.  All  data  for  a 
single  event  is  grouped  sequentially  in  this  file.  Another  data  set  is 
read  into  core  from  disk  and  is  organized  as  shown  in  Table  III  into  a 
common area. Note that a pointer to the master variable file, which is
random-access, is defined for each event.

TABLE II

FORMAT OF THE MASTER VARIABLE FILE

Parameter                          Storage Mode

Event Number                       I*2
Station Number                     I*2
Variable Number                    I*2
Amplitude                          R*4
Magnitude                          R*4
Distance to source                 R*4
Azimuth from source                R*4


TABLE III

FORMAT OF THE MASTER EVENT COMMON AREA

Parameter                          Storage Mode

Event Number                       I*2
Pointer to master bulletin         I*2
Pointer to master variable file    I*2
Flag for temporary display         I*2
Group Number                       I*2
Cluster Number                     I*2
Event type                         L*1
Date                               I*4
Origin Time                        I*4
Latitude                           R*4
Longitude                          R*4
Depth                              R*4
Magnitude                          R*4
Major semiaxis length              R*4
Minor semiaxis length              R*4
Azimuth of major axis              R*4
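The record layout of Table II can be illustrated with a short Python sketch that packs and unpacks one master variable file record. This is a minimal sketch under stated assumptions, not the DISE code: the field names, the little-endian byte order, the IEEE floating-point encoding, and the zero-based record pointer are all hypothetical choices made for the example.

```python
import struct

# Table II record: three INTEGER*2 fields followed by four REAL*4 fields.
# "<" selects little-endian byte order with no padding. IEEE-754 floats are
# an assumption for illustration; the VAX native floating format differs.
VARIABLE_RECORD = struct.Struct("<3h4f")

# Hypothetical field names, in the order listed in Table II.
FIELDS = ("event", "station", "variable",
          "amplitude", "magnitude", "distance", "azimuth")


def pack_record(rec):
    """Serialize one record (a dict keyed by FIELDS) to bytes."""
    return VARIABLE_RECORD.pack(*(rec[name] for name in FIELDS))


def unpack_record(raw):
    """Deserialize the bytes of one record back into a dict."""
    return dict(zip(FIELDS, VARIABLE_RECORD.unpack(raw)))


def read_record(buffer, pointer):
    """Random access: `pointer` is an assumed zero-based record index."""
    start = pointer * VARIABLE_RECORD.size
    return unpack_record(buffer[start:start + VARIABLE_RECORD.size])


sample = {"event": 12, "station": 3, "variable": 1,
          "amplitude": 250.0, "magnitude": 5.5,
          "distance": 42.0, "azimuth": 135.0}
raw = pack_record(sample)
```

Because every record has the same fixed size, a per-event pointer such as the one held in the master event common area is enough to locate an event's data without scanning the file.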


GETE: Select one epicenter


This  function  selects  a  single  hypocenter  for  the  purpose  of 
identification  and  displays  all  the  available  data  for  it  to  the 
analyst. Input consists of analyst-supplied date and origin time. A
search is made through the master event set for the event; it is flagged
as a "temporary event set", and the epicenter data for the event is
displayed at the analyst's terminal.

SELE:  Select  a  group  of  epicenters 

This  function  enables  the  analyst  to  select  a  group  of  events  on  the 
basis  of  their  hypocenter  attributes.  Such  a  group  will  often,  but  not 
always,  assume  the  role  of  a  "training"  set  for  the  purpose  of 
identifying  unknown  events.  The  analyst  supplies  latitude,  longitude, 
depth,  magnitude,  and  date  limits  to  be  used  as  selection  criteria.  The 
analyst  also  specifies  event  type,  either  explosion  or  earthquake,  or 
otherwise,  with  the  default  being  unidentified  events.  A  search  is  made 
through  the  master  event  set  for  those  events  satisfying  the  selection 
criteria,  and  selected  events  are  flagged  as  a  "temporary  event  set".  A 
summary  of  the  selection  results  is  listed  on  the  A/N  terminal; 
specifically,  the  analyst  sees  the  number  of  events  found  and  the  actual 
events  which  satisfied  the  specified  selection  criteria. 
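
The selection logic described above can be sketched in Python as follows. This is an illustrative sketch only: the `Event` class, its field names, the integer date encoding, and the `select_events` function are hypothetical and are not taken from the DISE code. Longitude limits that wrap across the date line are not handled.

```python
from dataclasses import dataclass


@dataclass
class Event:
    number: int
    latitude: float
    longitude: float
    depth: float
    magnitude: float
    date: int            # assumed yyyymmdd integer encoding
    event_type: str      # "explosion", "earthquake", or "unidentified"
    temporary: bool = False


def select_events(events, lat, lon, depth, mag, date,
                  event_type="unidentified"):
    """Flag events inside every (low, high) limit pair as the temporary set.

    Each limit argument is a (low, high) tuple. Events that fail any
    criterion have their temporary flag cleared.
    """
    def inside(value, limits):
        return limits[0] <= value <= limits[1]

    selected = []
    for ev in events:
        ev.temporary = (inside(ev.latitude, lat)
                        and inside(ev.longitude, lon)
                        and inside(ev.depth, depth)
                        and inside(ev.magnitude, mag)
                        and inside(ev.date, date)
                        and ev.event_type == event_type)
        if ev.temporary:
            selected.append(ev)
    return selected
```

The returned list corresponds to the summary shown to the analyst: its length is the number of events found, and its members are the events that satisfied the criteria.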

MAPE: Map of epicenters

This  function  presents  the  analyst  with  a  graphics  display  of  the 
epicenter(s)  in  the  temporary  event  subset.  They  are  displayed  in  plan 
view  or  on  a  map  with  background  of  coastlines  and  political  boundaries. 
The  analyst  enters  a  map  projection  option,  whether  or  not  confidence 
limits  on  latitude  and  longitude  are  to  be  shown  about  the  epicenters, 
and whether or not presently displayed epicenters are to be retained on
the  screen  in  addition  to  those  of  the  new  temporary  event  subset.  The 
required  spatial  limits  of  the  map  are  computed  and  the  appropriate 
sections  of  the  coastline  and  political  boundary  files  are  accessed  to 
construct  the  map.  A  transformation  on  latitude  and  longitude,  if 
required, is made to satisfy the analyst's projection option. The
epicenter coordinates are similarly transformed before displaying.
Confidence ellipses, if required, are read from the master event common
area and displayed about the epicenter(s). The visual output consists
of the graphics display. A map bookkeeping record is updated to reflect
current map limits, map projection type, and subsets of data being
displayed. A "pick" attribute is associated with each epicenter on the
screen  for  later  editing  purposes. 


CROS: Cross-section of epicenters

This function converts a plan view of hypocenters to a display on a
plane normal to the earth's surface to provide visual aid in depth
analysis. The analyst supplies the orientation of the plane to which
the hypocenters are projected and the maximum depth of hypocenters to be
displayed.

Let $\alpha$ be the specified angle of the projection plane from north
($0 < \alpha < \pi/2$). Let $(\theta, \phi)$ be the colatitude and east
longitude of the centroid of epicenters to be projected. Consider then
the spherical triangle in Figure 3 with angles A, B, and C and sides a,
b, and c (in radians). Let the projection of an epicenter at
$(\theta_e, \phi_e)$ be normal to the plane so that $C = \pi/2$. Using
the law of sines:

$$\frac{\sin a}{\sin A} = \frac{\sin c}{\sin C}$$

Since C is a right angle, we have from spherical trigonometry:

$$\sin a = \sin c \cdot \sin A$$

or

$$a = \arcsin(\sin c \cdot \sin A)$$

Thus a is represented in terms of known quantities since c and
$A = |\beta - \alpha|$ can be computed from $(\theta, \phi)$ and
$(\theta_e, \phi_e)$. Again, from spherical trigonometry:

$$\sin b = \sin B \cdot \sin c$$

$$\cos B = \tan a \cdot \cot c$$

Using $\cos^2 B + \sin^2 B = 1$ gives

$$\sin b = \sqrt{1 - \tan^2 a \, \cot^2 c} \; \sin c$$

or

$$b = \arcsin\left[ \sqrt{1 - \tan^2 a \, \cot^2 c} \; \sin c \right]$$



This  gives  the  horizontal  distance  along  the  plane  where  the  given 
epicenter  lies  relative  to  the  reference  centroid  point.  With  all 
epicenters  projected  in  this  manner,  the  cross-section  view  is  displayed 
on  the  graphics  screen.  The  confidence  limits  on  depth  and  on  the 
horizontal  coordinate  are  displayed  if  requested.  Since  the  projection 
plane lies at an orientation $\alpha$ from north and the confidence ellipse at
an orientation $\gamma$, the appropriate confidence limit for the projected
epicenter is the intersection of a line parallel to the plane with the
ellipse as shown in Figure 4. Let $\delta = |\gamma - \alpha|$ and p and q be the major
and minor semi-axis lengths of the ellipse, respectively. Then L, the
projected confidence limit, is contained in the equation of the ellipse
thus:

$$\left( \frac{L \cos\delta}{p} \right)^2 + \left( \frac{L \sin\delta}{q} \right)^2 = 1$$

Solving for L gives

$$L = \left( \frac{\cos^2\delta}{p^2} + \frac{\sin^2\delta}{q^2} \right)^{-1/2}$$


FRMG:  Form  group  of  events 

This  function  formally  designates  a  temporary  event  subset  as  a 
"group".  Groups  are  logically  associated  events  which  will  serve  as 
training  sets  for  later  discrimination  or  whose  features  will  be  later 
compared  analytically  with  those  of  other  groups.  A  group  number  and 
symbol are provided by the analyst (the symbol serves to identify
members of this group throughout all later stages involving graphics
displays). The analyst also enters an alphanumeric string to describe
this group. Checking is required to prevent overwriting a
previously formed group, and the analyst is prompted to choose another
group number if it already is used or to delete the previous group if
desired. The group association number is set in the master event common
area and the bookkeeping records on groups are updated to reflect the
new group. Epicenter symbols on the map (if displayed) are changed from
the  temporary  symbol  to  the  designated  group  symbol. 

TRSG:  Transfer  a  group 

This  function  transfers  a  chosen  group  back  to  temporary  status  for 
editing  and  display  of  epicenters.  The  temporary  flags  in  the  master 
event  common  area  are  set  for  those  events  belonging  to  the  specified 
group.  The  analyst  can  subsequently  reform  the  group  with  the  command 
function  FRMG  or  leave  it  unchanged. 

DELG: Delete a group

This  function  deletes  a  specified  group.  All  variables  set  by  FRMG 
for  this  group  are  erased  and  any  subgroups  created  through  the  SDAT 
command,  as  described  later,  are  erased.  Epicenters  for  this  group,  if 
displayed,  are  deleted  from  the  graphics  display. 

SUMG:  Summarize  group  data 

This  function  summarizes  the  available  discrimination  data  for  each 
group  with  respect  to  station  or  variable.  This  enables  the  analyst  to 
see  exactly  what  data  is  available. 


The  available  data  for  all  specified  groups  is  tabulated  in  a 
triply-indexed  array  (group,  station,  and  variable).  A  summary  table  is 
prepared  by  counting  flags  for  the  variable(s)  or  station(s)  requested 
by the analyst. Figure 5 illustrates the format for this table. In
the first table (by station), the number of events in each group with
the variable identified as "MB" is listed for all stations in the data
base. In the second table (by variable), the number of events for which
the station "ALE" appears with all of the possible variables in the data
base is listed for each group.

VARIABLE = mb

STATION                GROUP
            1    2    3    4    5    6    7

ALE         1    5    3    2    2    6    7
BLA         2    7    3    3    5    5    4
COM         2    ?    4    4    4    3    2


STATION = ALE

VARIABLE               GROUP
            1    2    3    4    5    6    7

Ms          5    5    6    3    2    2    3
mb          6    2    1    7    6    5    2
FC          2    3    3    4    8    4    3

Figure 5. Format of tables which summarize group data by station or by
variable.

SDAT: Formation of subgroups

This  function  creates  subgroups  from  any  existing  groups.  The 
analyst  specifies  the  groups  for  which  the  subgroups  will  be  made.  He 
also  enters  stations  and  variables  to  be  considered  in  the  formation  of 
the  subgroups.  The  subgroup  data  sets  contain  variables  characterizing 
the  events  in  the  groups,  implying  an  averaging  of  magnitudes  over 
stations  to  produce  a  network  value  if  more  than  one  station  is 
requested. The subgroup organization of variables is appropriate for
later input to various statistical or analysis modules. Subgroups
pertaining to each of several stations can be formed by specifying only
one station in successive subgroup formations.

Subgroup formation proceeds one group at a time. Within each group,
the data for each event is searched for the specified variables and
stations. If more than one station is specified, network averaging of
variables is accomplished by the method of Ringdal (1976). Any event
for which no actual measurement of a given variable is available among
the specified stations will have an "estimated" value placed on this
given variable if noise measurements are available. This value is at
the 5% probability level (thus an upper bound) of the magnitude for
non-detection of the associated seismic phases at all stations given
their noise levels (again using formulas in Ringdal). A flag is set for
the variable to indicate that no actual signal measurements were
involved in computing it. If no data whatsoever is available, the flag
is set to another value and a zero assigned to the variable. The
subgroup data is retained in core and pointers created to identify the
starting locations for each subgroup. The arrangement of the subgroup
data is illustrated in Figure 6. Entries will be made in a subgroup
information table to indicate the parent group, the number of events in
the subgroup, and the stations and variables represented in the
subgroup.

SCAL:  Variable  scaling 

This function removes bias present in the variables due to event
size. This is required when, for instance, the training sets of
explosions and earthquakes have significantly different sample means of
body-wave magnitude. Empirical scaling relations are invoked to remove
the bias. The process may also be termed "normalization" of the
variables.

The  analyst  specifies  the  subgroup  numbers  for  which  the  scaling  is 
to  be  performed.  A  set  of  scaling  coefficients  is  available  in  a  common 
area.  The  scaling  will  be  done  in  reference  to  a  particular  variable, 
body-wave  magnitude  being  the  recommended  reference.  Scaling  formulas 
are  of  the  linear  type: 


x' = ax + b


where  x  is  the  value  of  the  variable  to  be  scaled  and  a  and  b  are 
coefficients.  Some  variables,  by  their  nature,  require  no  scaling 
(e.g.,  complexity)  and  will  be  unchanged  in  this  processing.  The  output 
is  the  replacement  of  the  variables  in  the  subgroup  data  array  by  the 
scaled  ones.  A  flag  will  be  set  in  the  subgroup  information  table  to 
indicate  that  scaling  was  performed. 

SIML:  Similarity  of  groups 

This  function  provides  the  analyst  with  a  rapid  quantitative  measure 
of  the  similarity  between  two  groups  of  events.  Optionally,  one  group 
may  contain  only  one  event;  and  in  this  case  the  likelihood  of  the  event 
belonging to the specified group is computed. This function is a
simpler, but less informative, means of classifying events than the
application of full multivariate discrimination, to be described later.


In the case of more than one event in both groups, the Mahalanobis
distance

$$r^2 = (m_1 - m_2)^T \, \Sigma^{-1} \, (m_1 - m_2)$$

is computed. Here $m_1$ and $m_2$ are the vectors of group means of the
variable vectors x, computed thus

$$m_1 = \frac{1}{N_1} \sum_{i=1}^{N_1} x_{1i}$$

$$m_2 = \frac{1}{N_2} \sum_{i=1}^{N_2} x_{2i}$$

where $N_1$ and $N_2$ are the number of events in each group. $\Sigma$ is a pooled
covariance matrix computed thus

$$\Sigma = \frac{(N_1 - 1)\,\Sigma_1 + (N_2 - 1)\,\Sigma_2}{N_1 + N_2 - 2}$$

from the two group covariance matrices $\Sigma_1$ and $\Sigma_2$.

The quantity

$$T^2 = \frac{N_1 N_2}{N_1 + N_2} \, r^2$$

called the Hotelling's $T^2$ statistic, can be converted to an F statistic
thus (Morrison, 1976, p. 137)

$$F = \frac{N_1 + N_2 - p - 1}{(N_1 + N_2 - 2)\,p} \, T^2$$

where p is the number of variables. This quantity is compared against
standard cumulative F tables with p and $N_1 + N_2 - p - 1$ degrees of
freedom, yielding a probability P. The probability of the two groups
belonging to the same population is then given as 1 - P.

In the case of only one event in one of the groups, the test
statistic becomes simply

$$\chi^2 = (x_t - m_2)^T \, \Sigma_2^{-1} \, (x_t - m_2)$$

where $\Sigma_2$ is based on the second group only. This quantity is compared
to standard $\chi^2$ tables with p degrees of freedom, yielding a probability
P. The probability of the single event belonging to the second group
is then 1 - P.
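
The SIML statistics can be illustrated with a standard-library Python sketch. The function names and the list-of-rows data layout are hypothetical and chosen for the example; the sketch computes the test statistics only and leaves the comparison against F or chi-square tables to the reader, as the text above does.

```python
def mean_vector(rows):
    """Column means of a list of equal-length variable vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows):
    """Sample covariance matrix (divisor N - 1)."""
    n, m = len(rows), mean_vector(rows)
    p = len(m)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(matrix, rhs):
    """Solve matrix * y = rhs by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y


def quadratic_form(d, matrix):
    """Return d^T * inverse(matrix) * d without forming the inverse."""
    return sum(di * yi for di, yi in zip(d, solve(matrix, d)))


def group_similarity(group1, group2):
    """Return (r2, T2, F, degrees of freedom) for two groups of events."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean_vector(group1), mean_vector(group2)
    p = len(m1)
    s1, s2 = covariance(group1), covariance(group2)
    pooled = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
               for j in range(p)] for i in range(p)]
    r2 = quadratic_form([u - v for u, v in zip(m1, m2)], pooled)
    t2 = n1 * n2 / (n1 + n2) * r2
    f = (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * t2
    return r2, t2, f, (p, n1 + n2 - p - 1)


def event_similarity(event, group):
    """Chi-square statistic of a single event against one group."""
    d = [u - v for u, v in zip(event, mean_vector(group))]
    return quadratic_form(d, covariance(group))
```

Solving the linear system directly avoids an explicit matrix inverse, which is a common design choice when only the quadratic form is needed. Each group must contain more events than there are variables for the covariance matrices to be invertible.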

CLUS:  Clustering  of  events 

This  function  clusters  seismic  events  based  on  specified  measurement 
variables. The computational process is contained in the IMSL
subroutine OCLINK (IMSL, 1980), and it is beyond the scope of this
report to describe it fully. Clustering is an alternative approach to
the linear discrimination function as a means of classifying events.
It is also likely to be invoked to determine the homogeneity of
training event sets or the source-region effects on discrimination
variables.

The  analyst  inputs  the  final  number  of  clusters  to  be  formed  from 
all  the  events  and  the  subgroup  numbers  from  which  these  events  are  to 
be  taken.  Options  for  different  clustering  algorithms  are  also  input. 
The  results  of  the  clustering  are  presented  on  the  A/N  terminal.  A 
number  is  set  in  the  master  event  common  area  to  indicate  the  cluster 
group  to  which  each  event  was  amalgamated. 
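
The IMSL routine itself is not reproduced here. As a stand-in, the following Python sketch shows one common hierarchical method, single-linkage agglomerative clustering, stopping when the analyst's requested number of clusters remains. The choice of algorithm, the Euclidean distance, and the function name are assumptions for illustration and may differ from the options OCLINK provides.

```python
import math


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    points: list of variable vectors, one per event.
    Returns a list of cluster numbers (starting at 1), one per event,
    analogous to the cluster number kept for each event.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = [(gap(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] += clusters.pop(b)   # b > a, so index a stays valid

    labels = [0] * len(points)
    for number, members in enumerate(clusters, start=1):
        for i in members:
            labels[i] = number
    return labels
```

The returned labels play the role of the per-event cluster number, from which new groups could then be formed.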

SPLT:  Reformation  of  groups 

This  function  forms  new  event  groups  based  on  the  clustering 
results.  Essentially  it  accomplishes  what  several  passes  through  the 
function  FRMG  described  above  would  do. 

XYPL: Scattergrams

This  function  provides  a  scattergram  on  the  graphics  terminal  of  any 
two  specified  variables  as  an  analysis  aid  in  determining  group 
characteristics.  Events  with  outlying  variable  values  can  be  readily 
identified  in  this  manner.  Variables  from  more  than  one  group  can  be 
displayed  simultaneously,  and  the  symbols  will  be  those  originally 
assigned  to  the  groups  by  the  FRMG  command  described  above. 

HGPL: Histograms

This  function  provides  a  histogram  on  the  graphics  terminal  for  any 
specified  variable.  This  allows  the  analyst  to  see  the  distribution  of 
that  variable  for  a  given  group,  or  optionally,  for  two  groups  such  as 
explosions  and  earthquakes. 

CDIS: Multivariate Discrimination

This  function  forms  the  multivariate  discriminant  function  between 
two or more groups ("training sets") of seismic events. The processing
consists largely of the IMSL subroutine ODNORM (IMSL, 1980). All the
variables in the specified subgroups will be used. Those events having
an incomplete set of variables (the case of flags indicating no data
available) will be deleted from the training set prior to computation.
The important computed quantities are the group classification vectors

$$D_k = \Sigma^{-1} \, \bar{x}_k, \qquad k = 1, \ldots, \text{number of groups}$$

and their associated constants

$$C_k = -\tfrac{1}{2} \, \bar{x}_k^T \, \Sigma^{-1} \, \bar{x}_k, \qquad k = 1, \ldots, \text{number of groups}$$

where $\bar{x}_k$ (the mean vector of group k) and $\Sigma$ are as described
for SIML above. The vectors $D_k$ and $C_k$
values are stored in a discrimination table, which indicates the
subgroups involved. An indication of the power of each variable for
discrimination among groups is computed through IMSL subroutine ODFISH
(IMSL, 1980) and displayed on the A/N terminal.

ADIS:  Apply  Discrimination 

This  function  applies  the  group  classification  functions  computed  as 
described  above  to  one  or  more  events  of  unknown  type.  The  probability 
of  belonging  to  each  group  is  computed.  The  classification  scheme  is 
illustrated  in  Figure  7  for  the  case  where  network  variables  are  used; 
i.e.,  the  discrimination  variables  are  averaged  over  the  network  of 
stations  when  the  subgroup  data  is  formed  in  SDAT. 

The analyst has the option to specify unequal prior probabilities
for the different groups among which the unknown event(s) are being
classified.

The a posteriori probabilities

$$P_k(x) = \frac{\exp(x^T D_k + C_k)}{\sum_{i=1}^{M} \exp(x^T D_i + C_i)}$$

are computed for classifying the unknown event with variable vector x to
each of M groups. These probabilities are displayed to the analyst.
The classification values for each event, as represented by

$$x^T D_k + C_k, \qquad k = 1, \ldots, M \text{ groups}$$

will be stored in a classification table for later reference during
execution of the SDIS command described next.

Figure 7. Scheme for event classification using network averaged variables.
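
A minimal Python sketch of the a posteriori probability computation follows. It evaluates the classification value for each group and normalizes the exponentials. The way prior probabilities enter is an assumption, since the expression above does not show them: here the logarithm of each prior is added to the classification value, which reduces to the expression above when the priors are equal. The function name and the numerical values are hypothetical.

```python
import math


def posterior_probabilities(x, table, priors=None):
    """Posterior probability of each group for variable vector x.

    table holds one (D_k, C_k) pair per group. priors defaults to equal
    values; adding log(prior) to each score is an assumed convention.
    """
    m = len(table)
    if priors is None:
        priors = [1.0 / m] * m
    scores = [
        sum(xi * di for xi, di in zip(x, d_k)) + c_k + math.log(p)
        for (d_k, c_k), p in zip(table, priors)
    ]
    top = max(scores)  # subtract the largest score to avoid overflow in exp
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical discrimination table for two groups and two variables.
table = [([2.0, 2.0], -4.0), ([6.0, 6.0], -36.0)]

near_group_1 = posterior_probabilities([2.0, 2.0], table)
midway = posterior_probabilities([4.0, 4.0], table)
midway_with_priors = posterior_probabilities([4.0, 4.0], table, priors=[0.25, 0.75])
```

For an event midway between the two groups the classification values are equal, so the posterior probabilities simply reproduce the priors.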

SDIS:  Sum  Discriminants 

Sometimes  the  analyst  may  wish  to  compute  classification  functions 
for  individual  stations  of  a  network  and  then  apply  these  to  the  unknown 
events  separately.  The  network  classification  values  are  then  the  sum 
of  the  individual  station  values.  This  scheme  is  illustrated  in  Figure 
8,  and  this  command  performs  the  final  step  of  that  illustration. 
Figure  8  should  be  compared  to  Figure  7,  and  a  discussion  of  the 
relative  merits  of  the  two  discrimination  approaches  is  given  in  Rivers 
et  al.  (1981). 

The analyst can optionally specify weights $w_j$ for each of the L
stations used. The summed classification values are given by

$$\sum_{j=1}^{L} w_j \left[ x_j^T D_{jk} + C_{jk} \right]$$

For each unknown event, a posteriori probabilities are computed as
described for the ADIS command, except the above expression replaces

$$x^T D_k + C_k$$
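
The weighted summation over stations can be sketched as follows. The sketch takes the per-station classification values as already computed and stored, forms the weighted sum for each group, and converts the sums to a posteriori probabilities. The function names and numbers are hypothetical, and unit weights are assumed when none are given.

```python
import math


def summed_classification_values(station_values, weights=None):
    """Weighted sum over stations of the per-group classification values.

    station_values[j][k] is the classification value of station j for
    group k. weights[j] is the optional station weight (default 1.0).
    """
    if weights is None:
        weights = [1.0] * len(station_values)
    n_groups = len(station_values[0])
    return [
        sum(w * row[k] for w, row in zip(weights, station_values))
        for k in range(n_groups)
    ]


def posteriors_from_values(values):
    """Normalize exp(value) over groups, as described for ADIS."""
    top = max(values)
    expo = [math.exp(v - top) for v in values]
    total = sum(expo)
    return [e / total for e in expo]


# Hypothetical values for one unknown event: 3 stations, 2 groups.
station_values = [[4.0, -12.0], [1.0, 3.0], [2.0, 0.0]]
unweighted = summed_classification_values(station_values)
weighted = summed_classification_values(station_values, weights=[1.0, 0.5, 2.0])
probabilities = posteriors_from_values(weighted)
```

In this example the second station alone favors the second group, but its reduced weight leaves the network decision with the first group.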

LSTG:  List  Groups 

This  command  produces  a  listing  on  the  A/N  terminal  of  the 
attributes  of  each  group  formed  in  the  analysis  session;  i.e.,  the 
selection  criteria,  the  event  type,  and  the  number  in  each  group. 

LSTS:  List  Subgroups 

Similar  to  LSTG,  this  command  produces  a  listing  of  all  the 
subgroups  formed  in  the  analysis  session,  with  the  stations  and 
variables  which  defined  them. 

LSTD:  List  Discriminants 

This  command  produces  a  listing  on  the  A/N  terminal  of  all  the 
pertinent  information  concerning  computed  classification  functions. 
DATA  BASE  AND  SUPPORT 


Master  Bulletin  File 

The  Regional  Event  Location  System  (RELS)  being  designed  under 
separate  contract  at  the  SDAC  will  produce  an  output,  called  the  EAF 
(Event  Arrival  File),  which  can  properly  be  termed  a  seismic  bulletin. 
It  will  contain  event  information  and  data  for  seismic  phases  associated 
with  each  event.  The  format  of  the  EAF  is  as  presented  in  "Data  Base 
Specifications — Seismic  Research  Information  System,  "  available  at  the 
SDAC.  It  is  not  yet  known  how  the  RELS  bulletin  will  be  mass  stored. 
In  any  case,  the  analyst  will  only  want  a  portion  of  it  for  any  given 
session.  It  is  foreseen  then  that  an  offline  job  will  prepare  a  subset 
of  the  full  bulletin  to  be  used  as  the  actual  input  to  the  DISE  program. 
It  is  from  this  bulletin  that  the  master  variable  file  and  master  event 
common  area  would  be  created  during  the  INIT  command  of  DISE.  However, 
it  is  again  pointed  out  that  presently  the  INIT  command  is  automatically 
invoked  and  merely  accesses  the  master  variable  file,  which  has  been 
prepared  offline. 

Master  Variable  File 

This  is  a  random-access,  unformatted  file  containing  the  variables 
to  be  used  in  the  analysis  session.  The  format  was  given  in  Table  2 
earlier.  The  entries  are  arranged  by  event  in  this  file,  and  pointers 
are  maintained  in  the  master  event  common  area  to  access  the  data.  The 
file length will vary, but is roughly given by #events x #stations x
#variables x 22 bytes, where "#stations" denotes the average number of
stations per event and "#variables" denotes the average number of
variables per station.
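
The file length estimate can be worked through with a short Python sketch. The session sizes used in the first example are hypothetical; the second example uses the dimension limits listed under Data Limitations to give a rough upper bound.

```python
def master_variable_file_bytes(n_events, avg_stations, avg_variables, entry_bytes=22):
    """Rough file length: #events x #stations x #variables x 22 bytes."""
    return round(n_events * avg_stations * avg_variables * entry_bytes)


# Hypothetical session: 500 events, 12 stations per event, 8 variables per station.
typical = master_variable_file_bytes(500, 12, 8)

# Upper bound from the current dimensions: 1000 events, 30 stations, 30 variables.
upper_bound = master_variable_file_bytes(1000, 30, 30)
```

The hypothetical session gives about 1.06 million bytes, and the dimension limits give about 19.8 million bytes.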

Station  Constant  File 

The  station  constant  file  contains  location  data  on  the  seismic 
stations  and  descriptions  of  their  sensors.  This  file  is  available  on 
the  VAX-11/780  and  is  adequate  for  the  DISE  program.  The  file  length  is 
presently  roughly  100K  bytes. 
World  Map  File 


The  world  map  file  will  actually  consist  of  many  files.  One  large 
file holds the entire set of digitized world coastlines. A group of 648
separate files holds 10 degree by 10 degree segments of this world
coastline map. Similar files exist for the political boundaries. The
total  length  of  these  files  is  approximately  130K  bytes. 
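
The count of 648 segment files is consistent with 36 x 18 segments of 10 degrees each covering the globe. The sketch below shows one way a point could be mapped to its segment. The row-major numbering from the southwest corner is purely hypothetical; the actual ordering and naming of the segment files are not specified here.

```python
TILE_DEG = 10
TILES_PER_ROW = 360 // TILE_DEG  # 36 segments in longitude
TILE_ROWS = 180 // TILE_DEG      # 18 segments in latitude


def tile_index(lat, lon):
    """Hypothetical segment number (0 to 647) containing the point (lat, lon).

    Segments are counted eastward from longitude -180 and northward from
    latitude -90; this ordering is an assumption made for illustration.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must lie between -90 and 90 degrees")
    row = min(int((lat + 90.0) // TILE_DEG), TILE_ROWS - 1)  # the north pole joins the top row
    col = int(((lon + 180.0) % 360.0) // TILE_DEG)           # wrap longitude into [0, 360)
    return row * TILES_PER_ROW + col


total_tiles = TILES_PER_ROW * TILE_ROWS
```

With such an index, a map display covering a small region needs to read only the few segment files that intersect the region rather than the single large coastline file.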

Data  Limitations 

The  virtual  memory  feature  of  the  VAX-11/780  obviates  much  of  the 
I/O  for  temporary  data  sets  required  by  most  other  existing  mainframe 
computers.  Thus,  the  DISE  program  has  been  coded  to  maintain  extensive 
data  arrays  in  common  areas  and  in  work  areas  unique  to  each  module. 
However, the current array dimensions limit an analysis session to

      9 groups
      9 subgroups per group
    200 events per group
     30 variables
     30 stations
   1000 events

In addition, the length of the vector holding the data arranged by
subgroup as in Figure 6 is 50,000.